Housing in the United States is expensive relative to local incomes: in 2019, 30.2% of households nationwide were considered housing cost burdened (spending more than 30% of income on housing) and 14% were severely housing cost burdened (spending more than 50% of income on housing). There are also large racial disparities in both housing cost burden and homeownership.
Historically, some disadvantaged minority groups were redlined into areas where it was difficult to obtain home loans, which made it harder for marginalized communities to build capital. Redlining also opened these areas to factories, highways, and other industrial development. As a result, marginalized communities, often Black and brown, had increased rates of asthma, heat-related illnesses, and heart and lung complications. This history lives on today: people who live in these areas are exposed to disproportionate levels of environmental risk.
In this project I explore how environmental risk, housing burden, and race correlate with each other in the Bay Area. The data come from CalEnviroScreen 4.0 (CES 4.0; environmental risks and housing burden), EJScreen (a nationwide environmental justice dataset from the EPA containing environmental risks), the Zillow Home Value Index (Zillow’s measure of home value), and the ACS (race data). The geographic granularity varies by dataset: CES 4.0 is at the tract level, EJScreen is at the block group level, and Zillow is at the ZCTA5 level. Comparing these datasets with each other costs granularity, because both must be brought to the coarser level (e.g., EJScreen and Zillow are compared at the ZCTA5 level). Below are two maps each of housing burden and environmental risk in the Bay Area, three race equity analyses, and simple regressions between environmental risk and housing burden. The Shiny app has additional regressions for every environmental factor in CES 4.0 and every index in EJScreen.
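To make the granularity trade-off concrete, here is a minimal sketch of rolling block-group values up to ZCTA5 with a population-weighted mean. It is not the code used in this project: the data frame, its column names (`zcta`, `population`, `ej_index`), and every value are hypothetical, and it assumes each block group has already been assigned to exactly one ZCTA5, which real boundaries do not guarantee.

```r
# Hypothetical block-group records; all names and values are illustrative.
block_groups <- data.frame(
  zcta       = c("94601", "94601", "94601", "94602", "94602"),
  population = c(1200, 800, 2000, 1500, 500),
  ej_index   = c(40, 60, 20, 10, 30)
)

# Population-weighted mean of the index within each ZCTA5.
weighted_by_zcta <- sapply(
  split(block_groups, block_groups$zcta),
  function(g) weighted.mean(g$ej_index, w = g$population)
)

# Spread inside each ZCTA5 that the single aggregated value hides.
range_by_zcta <- sapply(
  split(block_groups, block_groups$zcta),
  function(g) diff(range(g$ej_index))
)

print(weighted_by_zcta)
print(range_by_zcta)
```

Each ZCTA5 ends up with one number, while the block groups inside it can differ widely. That within-area spread is the information lost when a finer dataset is compared against a coarser one.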
According to CES 4.0, white householders and those identifying as “two or more races” are the least likely to experience environmental risks where they live, while Black householders and those identifying as “some other race alone” (most likely Hispanic/Latino) are the most likely. Asian householders also experience slightly elevated risks.
According to EJScreen, householders identifying as “some other race alone” are much more at risk than CES 4.0 suggested, and Black householders are at less risk than CES 4.0 suggested (although still disproportionately worse off overall). Asian householders, like those identifying as “some other race alone”, also experience more risk in EJScreen than in CES 4.0. Interestingly, the last bin (60-72) suggests that white householders experience the worst environmental risks, despite this group faring well in every other bin and in the CES 4.0 equity analysis. This may be partly an artifact of sample size, since n is particularly small in this bin.
According to CES 4.0, householders identifying as “some other race alone” experience the worst levels of housing burden, with Black and “American Indian and Alaska Native alone” householders close behind. White householders, while still less likely to be housing burdened, do not fare as well on housing burden as they do on environmental risk. Another change from the environmental risk analyses is that fewer Asian householders are housing burdened.
##
## Call:
## lm(formula = housing_burden ~ ces_4_0_score, data = ces4_bay)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -14.766  -3.524  -0.756   2.976  33.997
##
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept)    8.81738    0.26701   33.02   <2e-16 ***
## ces_4_0_score  0.34796    0.01177   29.55   <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 5.648 on 1561 degrees of freedom
## (18 observations deleted due to missingness)
## Multiple R-squared: 0.3588, Adjusted R-squared: 0.3584
## F-statistic: 873.4 on 1 and 1561 DF, p-value: < 2.2e-16
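The regression of CES 4.0 housing burden on the overall CES 4.0 score shows a positive relationship, with a slope of 0.34796: each one-point increase in CES 4.0 score is associated with a 0.35 increase in housing burden on average. The standard error of the slope, 0.01177, is small, and the score explains about 36% of the variance in housing burden (R-squared of 0.3588). Still, the residuals are large (a residual standard error of 5.648, with residuals as high as 34), so the CES 4.0 score alone predicts housing burden for an individual tract only loosely. The Shiny app has many more regressions; notable ones to explore are lead, asthma, low birth weight, and cardiovascular disease.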
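As a sanity check on how the numbers in the CES 4.0 summary fit together, the sketch below recomputes the t value, R-squared, and the implied correlation from the printed figures alone. The variable names are my own, and because the inputs are the rounded values shown in the output, the results only match the printed ones approximately.

```r
# Values copied from the printed lm() summary for the CES 4.0 regression.
estimate  <- 0.34796
std_error <- 0.01177
f_stat    <- 873.4
df_resid  <- 1561

# t statistic: the estimate divided by its standard error.
t_value <- estimate / std_error

# For a one-predictor model, R^2 = F / (F + residual degrees of freedom).
r_squared <- f_stat / (f_stat + df_resid)

# The Pearson correlation has the sign of the slope and magnitude sqrt(R^2).
pearson_r <- sign(estimate) * sqrt(r_squared)

print(round(c(t = t_value, r_squared = r_squared, r = pearson_r), 3))
```

The implied correlation of roughly 0.6 is a moderate positive association, which is consistent with a tightly estimated slope alongside wide scatter around the fitted line.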
##
## Call:
## lm(formula = relative_cost_housing ~ index, data = ej_zillow)
##
## Residuals:
##    Min     1Q Median     3Q    Max
## -8.574 -2.613 -0.506  1.804 48.868
##
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) 11.811031   0.213143   55.41   <2e-16 ***
## index        0.015611   0.008216    1.90   0.0575 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.286 on 7131 degrees of freedom
## (105 observations deleted due to missingness)
## Multiple R-squared: 0.000506, Adjusted R-squared: 0.0003659
## F-statistic: 3.61 on 1 and 7131 DF, p-value: 0.05747
This Zillow/EJScreen regression shows almost no correlation between environmental risk and the cost of housing relative to income in an area: the slope is not significant at the 0.05 level (p = 0.0575), and R-squared is about 0.0005. This is likely due to issues in my own methodology and my use of Zillow’s dataset. Unfortunately, Zillow’s most granular option is ZCTA5, which is only available for their dataset that considers ALL home values in a zip code. This means that expensive homes can heavily skew results. Additionally, a single zip code can contain high levels of variance in housing burden and environmental risk, making this sort of regression less informative. On the Shiny app most regressions show a slight positive correlation, but CES 4.0 yields more promising results.
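The skew concern can be illustrated with a toy example. The home values below are invented, and the sketch assumes the zip-level figure behaves like a plain average, which may not match how Zillow actually computes its index.

```r
# Hypothetical home values (USD) in one zip code; the last one is an outlier.
home_values <- c(650000, 700000, 720000, 750000, 800000, 6000000)

mean_value   <- mean(home_values)
median_value <- median(home_values)

# Share by which the single expensive home lifts the mean above the median.
inflation <- mean_value / median_value - 1

print(c(mean = mean_value, median = median_value))
print(round(inflation, 2))
```

In this toy case one very expensive home more than doubles the average relative to the median, so a zip code where most homes are moderately priced would look far less affordable than it is for a typical household.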